skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Search for: All records

Creators/Authors contains: "Chu, Huaiyuan"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. The rising computational and energy demands of deep learning, particularly in large-scale architectures such as foundation models and large language models (LLMs), pose significant challenges to sustainability. Traditional gradient-based training methods are inefficient, requiring numerous iterative updates and high power consumption. To address these limitations, we propose a hybrid framework that combines hierarchical decomposition with FPGA-based direct equation solving and incremental learning. Our method divides the neural network into two functional tiers: lower layers are optimized via single-step equation solving on FPGAs for efficient and parallelizable feature extraction, while higher layers employ adaptive incremental learning to support continual updates without full retraining. Building upon this foundation, we introduce the Compound LLM framework, which explicitly deploys LLM modules across both hierarchy levels. The lower-level LLM handles reusable representation learning with minimal energy overhead, while the upper-level LLM performs adaptive decision making through energy-aware updates. This integrated design enhances scalability, reduces redundant computation, and aligns with the principles of sustainable AI. Theoretical analysis and architectural insights demonstrate that our method reduces computational costs significantly while preserving high model performance, making it well-suited for edge deployment and real-time adaptation in energy-constrained environments. 
    more » « less
    Free, publicly-accessible full text available June 30, 2026
  2. Different from traditional tedious CPU-GPU-based training algorithms using gradient descent methods, the software-FPGA co-designed learning algorithm is created to quickly solve a system of linear equations to directly calculate optimal values of hyperparameters of the green granular neural network (GGNN). To reduce both CO2 emissions and energy consumption effectively, a novel green granular convolutional neural network (GGCNN) is developed by using a new classifier that uses GGNNs as building blocks with new fast software-FPGA co-designed learning. Initial simulation results indicate that the FPGA equation solver code runs faster than the Python equation solver code. Therefore, implementing the GGCNN with software-FPGA co-designed learning is feasible. In the future, The GGCNN will be evaluated by comparing with a convolutional neural network with the traditional software-CPU-GPU-based learning in terms of speeds, model sizes, accuracy, CO2 emissions and energy consumption by using popular datasets. New algorithms will be created to divide the inputs to different input groups for building different GGNNs to solve the curse of dimensionality. 
    more » « less
  3. A novel green granular neural network (GGNN) with new fast software-FPGA co-designed learning is developed to reduce both CO2 emissions and energy consumption more effectively than popular neural networks with the traditional software-CPU-GPU-based learning. Different from traditional tedious CPU-GPU-based training algorithms using gradient descent methods and other methods such as genetic algorithms , the software-FPGA co-designed training algorithm may quickly solve a system of linear equations to directly calculate optimal values of hyperparameters of the GGNN. Initial simulation results indicates that the FPGA equation solver code ran faster than the Python equation solver code. Therefore, implementing the GGNN with software-FPGA co-designed learning is feasible. In addition, the shallow high-speed GGNN is explainable because it can generate interpretable granular If-Then rules. In the future, The GGNN will be evaluated by comparing with other machine learning models with traditional software-based learning in terms of speeds, model sizes, accuracy, CO2 emissions and energy consumption by using popular datasets. New algorithms will be created to divide the inputs to different input groups that will be used to build different small-size GGNNs to solve the curse of dimensionality. Additionally, the explainable green granular convolutional neural network will be developed by using the GGNNs as basic building blocks to efficiently solve image recognition problems. 
    more » « less